feat: cross-platform port — Windows/WSL2/Linux + CUDA (RTX 2060+)#2

Open
zapabob wants to merge 4 commits into t8:main from zapabob:windows-cuda-port

zapabob commented Mar 23, 2026

Summary

  • Adds support for Windows native, WSL2, and Linux with CUDA (RTX 2060+)
  • macOS Metal / Apple Silicon support is preserved unchanged
  • NVMe streaming works on Windows via FILE_FLAG_NO_BUFFERING + ReadFile(OVERLAPPED), the counterpart of F_NOCACHE + pread on macOS

Key changes

New file

  • src/io/compat.rs: platform abstraction layer unifying open_direct_fd / read_at_fd / alloc_pages / free_pages / advise_free_pages across macOS, Linux, and Windows, so direct I/O and memory management go through one API
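
To illustrate the kind of shim such a layer provides, here is a minimal sketch of a positional read dispatched per platform using only the standard library. It is not the PR's code: the names `read_at` and `read_exact_at` are hypothetical, and the sketch deliberately leaves out the direct-I/O flags (F_NOCACHE, FILE_FLAG_NO_BUFFERING) and the OVERLAPPED handling that the PR describes for its own read_at_fd.

```rust
use std::fs::File;
use std::io;

/// Unix: positional read (pread semantics); the file cursor is not moved.
#[cfg(unix)]
pub fn read_at(file: &File, buf: &mut [u8], offset: u64) -> io::Result<usize> {
    use std::os::unix::fs::FileExt;
    file.read_at(buf, offset)
}

/// Windows: read at an explicit offset; unlike pread, this also moves the cursor.
#[cfg(windows)]
pub fn read_at(file: &File, buf: &mut [u8], offset: u64) -> io::Result<usize> {
    use std::os::windows::fs::FileExt;
    file.seek_read(buf, offset)
}

/// Platform-independent helper: keep reading until `buf` is full.
/// Returns UnexpectedEof if the file ends before the buffer is filled.
pub fn read_exact_at(file: &File, mut buf: &mut [u8], mut offset: u64) -> io::Result<()> {
    while !buf.is_empty() {
        match read_at(file, buf, offset) {
            Ok(0) => return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "short read")),
            Ok(n) => {
                // Advance past the bytes that were just filled in.
                let rest = buf;
                buf = &mut rest[n..];
                offset += n as u64;
            }
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```

Callers only ever see `read_at` / `read_exact_at`; the `#[cfg]` attributes select one implementation at compile time, which is the general shape a compat module can take.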

Modified

| File | Change |
| --- | --- |
| src/io/aligned_buffer.rs | posix_memalign → std::alloc::Layout (cross-platform) |
| hypura-sys/build.rs | Metal / CUDA / CPU three-way build; CUDA auto-detect; dunce::canonicalize fixes the `\\?\` UNC path prefix that breaks MSBuild; pre-generated bindings fallback (see below) |
| hypura-sys/src/hypura_buft.c | #ifdef _WIN32 guards for VirtualAlloc/VirtualFree |
| src/profiler/cpu.rs | Per-OS hardware detection: macOS sysctl / Linux procfs / Windows sysinfo |
| src/profiler/gpu.rs | NVIDIA GPU spec DB (RTX 20/30/40/50 + A100/H100/L40S) |
| src/profiler/storage.rs | Windows storage benchmark fallback |
| src/compute/nvme_backend.rs | All libc I/O routed through the compat module |
| src/compute/inference.rs | sysctl → sysinfo (cross-platform) |
| src/io/async_reader.rs | Routed through the compat module |
| src/cli/iobench.rs | Routed through the compat module; dead code removed |
| src/scheduler/placement.rs | Per-OS OS_OVERHEAD / GPU_RUNTIME_OVERHEAD constants |
| Cargo.toml | windows-sys conditional dependency |
| README.md | Bilingual (JP/EN); Windows/WSL2 install instructions |
| .gitignore | Exclude .cargo/config.toml (machine-specific) and .claude/ |
| _docs/ | Dated implementation logs |
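
As an illustration of the posix_memalign → std::alloc::Layout change listed for src/io/aligned_buffer.rs, the sketch below allocates an aligned, zeroed buffer through the portable allocator API. The type name `AlignedBuffer` and its methods are assumptions made for this example; the PR's actual type may look different.

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};
use std::ptr::NonNull;

/// Hypothetical aligned byte buffer backed by the global allocator.
pub struct AlignedBuffer {
    ptr: NonNull<u8>,
    layout: Layout,
}

impl AlignedBuffer {
    /// Returns None for a zero size or an alignment that is not a power of two.
    pub fn new(size: usize, align: usize) -> Option<Self> {
        if size == 0 {
            return None;
        }
        let layout = Layout::from_size_align(size, align).ok()?;
        // SAFETY: `layout` has a non-zero size, as required by `alloc_zeroed`.
        let raw = unsafe { alloc_zeroed(layout) };
        match NonNull::new(raw) {
            Some(ptr) => Some(Self { ptr, layout }),
            None => handle_alloc_error(layout),
        }
    }

    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: `ptr` points to `layout.size()` initialized (zeroed) bytes.
        unsafe { std::slice::from_raw_parts(self.ptr.as_ptr(), self.layout.size()) }
    }

    pub fn as_mut_slice(&mut self) -> &mut [u8] {
        // SAFETY: same allocation as above, and `&mut self` guarantees exclusivity.
        unsafe { std::slice::from_raw_parts_mut(self.ptr.as_ptr(), self.layout.size()) }
    }
}

impl Drop for AlignedBuffer {
    fn drop(&mut self) {
        // SAFETY: `ptr` was allocated with exactly this `layout`.
        unsafe { dealloc(self.ptr.as_ptr(), self.layout) }
    }
}
```

Unbuffered reads generally need sector-aligned buffers, so a call such as `AlignedBuffer::new(8192, 4096)` would be a typical use. Storing the `Layout` alongside the pointer matters because `dealloc` must receive the same layout that was used to allocate.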

Pre-generated bindings fallback

bindgen requires libclang at build time, which is not always available (e.g., CI without LLVM installed).

This PR adds a two-level fallback in hypura-sys/build.rs, checked before bindgen is invoked:

  1. HYPURA_PREGENERATED_BINDINGS=/path/to/bindings.rs env var: points to a pre-generated file
  2. hypura-sys/bindings.rs in the source tree: committed pre-generated bindings (zero LLVM dependency)
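
The lookup order above could be sketched as follows. This is an assumption about the shape of the logic, not the code in hypura-sys/build.rs: the function name `pregenerated_bindings` is hypothetical, and falling through to the next level when the env var points at a missing file is a choice made for this sketch.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: decide which pre-generated bindings file to use, if any.
/// `env_override` stands in for the value of HYPURA_PREGENERATED_BINDINGS and
/// `crate_dir` for the hypura-sys crate directory (CARGO_MANIFEST_DIR in a build script).
pub fn pregenerated_bindings(env_override: Option<&str>, crate_dir: &Path) -> Option<PathBuf> {
    // Level 1: explicit path from the environment variable.
    if let Some(p) = env_override {
        let p = PathBuf::from(p);
        if p.is_file() {
            return Some(p);
        }
    }
    // Level 2: bindings.rs committed next to build.rs.
    let in_tree = crate_dir.join("bindings.rs");
    if in_tree.is_file() {
        return Some(in_tree);
    }
    // Neither exists: the caller would run bindgen, which needs libclang.
    None
}
```

In a build script the arguments would typically come from `std::env::var("HYPURA_PREGENERATED_BINDINGS").ok().as_deref()` and `Path::new(env!("CARGO_MANIFEST_DIR"))`, with the selected file copied into OUT_DIR in place of bindgen's output.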

To generate and commit bindings once (on a machine with LLVM):

```sh
# Windows: winget install LLVM.LLVM, then set LIBCLANG_PATH=C:\Program Files\LLVM\bin
cargo build
# Take out_dir from the hypura-sys build-script message, not from the first crate that reports one
out_dir=$(cargo build --message-format=json 2>/dev/null | grep 'hypura-sys' | grep -o '"out_dir":"[^"]*"' | head -1 | cut -d'"' -f4)
cp "$out_dir/bindings.rs" hypura-sys/bindings.rs
git add hypura-sys/bindings.rs && git commit -m "feat(build): commit pre-generated bindings"
```

Platform support

| Platform | GPU | NVMe I/O |
| --- | --- | --- |
| macOS (Apple Silicon) | Metal | F_NOCACHE + pread |
| Windows native | CUDA RTX 2060+ | FILE_FLAG_NO_BUFFERING + ReadFile |
| WSL2 | CUDA RTX 2060+ | posix_fadvise + pread |
| Linux | CUDA RTX 2060+ | posix_fadvise + pread |

CUDA architectures

Default: 75;86;89;90, i.e. sm_75 (RTX 20xx), sm_86 (RTX 30xx), sm_89 (RTX 40xx), sm_90 (H100).
Override: HYPURA_CUDA_ARCHITECTURES=86
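
A minimal sketch of how a build script might resolve this setting is shown below. The function and constant names are hypothetical, and treating an empty or whitespace-only value as "not set" is an assumption of the sketch rather than documented behavior of the PR.

```rust
/// Default architecture list quoted in the PR description.
pub const DEFAULT_CUDA_ARCHITECTURES: &str = "75;86;89;90";

/// Hypothetical helper: pick the CUDA architecture list to hand to CMake.
/// `override_value` stands in for the HYPURA_CUDA_ARCHITECTURES env var.
pub fn cuda_architectures(override_value: Option<&str>) -> String {
    match override_value.map(str::trim) {
        // A non-empty override wins as-is (for example "86" for RTX 30xx only).
        Some(v) if !v.is_empty() => v.to_string(),
        // Unset or blank: fall back to the default list.
        _ => DEFAULT_CUDA_ARCHITECTURES.to_string(),
    }
}
```

Restricting the list to the architecture of the local GPU shortens CUDA compile times, since each listed architecture is compiled separately.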


Test plan

  • cargo check → zero warnings
  • macOS: cargo build --release → Metal build succeeds
  • Windows/WSL2: cargo build --release → CUDA build succeeds (verified on RTX 3060/3080, sm_86)
  • hypura profile → hardware detection
  • hypura iobench ./model.gguf → NVMe streaming on Windows

🤖 Generated with Claude Code

zapabob and others added 3 commits March 23, 2026 18:45
Add platform abstraction layer (src/io/compat.rs) unifying all direct I/O
primitives across macOS, Linux/WSL2, and Windows. NVMe streaming now works
on Windows via FILE_FLAG_NO_BUFFERING + ReadFile(OVERLAPPED) — equivalent
to F_NOCACHE + pread on macOS.

Key changes:
- src/io/compat.rs: NativeFd type alias + open_direct_fd/read_at_fd/
  alloc_pages/free_pages/advise_free_pages for all platforms
- src/io/aligned_buffer.rs: posix_memalign → std::alloc::Layout
- hypura-sys/build.rs: Metal/CUDA/CPU three-way build with CUDA
  auto-detection and dunce::canonicalize (fixes \\?\ UNC path on Windows)
- hypura-sys/src/hypura_buft.c: #ifdef _WIN32 VirtualAlloc/VirtualFree
- src/profiler/{cpu,gpu,storage,mod}.rs: cross-platform hardware detection
  + NVIDIA GPU spec DB (RTX 20/30/40/50 + A/H series)
- src/compute/{nvme_backend,inference}.rs: compat module + sysinfo
- src/scheduler/placement.rs: per-OS OS_OVERHEAD/GPU_RUNTIME_OVERHEAD
- Cargo.toml: windows-sys conditional dependency
- README.md: bilingual (Japanese/English), Windows/WSL2 install instructions
- _docs/: dated implementation logs

CUDA architectures: sm_75 (RTX 20xx), sm_86 (RTX 30xx), sm_89 (RTX 40xx),
sm_90 (H100). Override with HYPURA_CUDA_ARCHITECTURES env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ific configs

- hypura-sys/build.rs: check HYPURA_PREGENERATED_BINDINGS env var or
  hypura-sys/bindings.rs before invoking bindgen, enabling builds on
  machines without LLVM/libclang installed; improve error message to
  guide users toward the fix
- .gitignore: exclude .cargo/config.toml (LIBCLANG_PATH is machine-
  specific) and .claude/ (local IDE settings); remove accidentally
  committed .claude/settings.local.json from tracking
- _docs/: add implementation log for the libclang Windows fix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Disable llama.cpp tool targets in hypura-sys CMake configuration to avoid building unsupported multimodal CLIs on Windows, and ignore local temporary build artifact directories.

Made-with: Cursor

zapabob commented Mar 23, 2026

Windows update: pushed commit 1fac4fd to head branch (includes LLAMA_BUILD_TOOLS=OFF in hypura-sys/build.rs). Local rebuild progressed through CUDA+CMake/MSBuild on Windows with the tools target disabled; continuing final smoke verification on my side.

Use isolated target-dir build runs and retain Windows runtime/link fixes while documenting redirected serve logs and successful /, /api/tags, /api/generate smoke results.

Made-with: Cursor